Towards speech-to-text translation without speech recognition

Authors

  • Sharon Goldwater
  • Adam Lopez
  • Sameer Bansal
  • Herman Kamper
Abstract

We explore the problem of translating speech to text in low-resource scenarios where neither automatic speech recognition (ASR) nor machine translation (MT) is available, but we have training data in the form of audio paired with text translations. We present the first system for this problem applied to a realistic multi-speaker dataset, the CALLHOME Spanish-English speech translation corpus. Our approach uses unsupervised term discovery (UTD) to cluster repeated patterns in the audio, creating a pseudotext, which we pair with translations to create a parallel text and train a simple bag-of-words MT model. We identify the challenges faced by the system, finding that the difficulty of cross-speaker UTD results in low recall, but that our system is still able to correctly translate some content words in test data.
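The pipeline described in the abstract (UTD cluster IDs forming a pseudotext, paired with translations to train a bag-of-words model) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the cluster IDs (`c12`, `c7`, `c3`), the toy sentence pairs, the function names `train_bow` and `translate`, and the co-occurrence estimate itself, which is a simple stand-in and not necessarily the model the authors trained.

```python
from collections import Counter, defaultdict


def train_bow(pairs):
    """Estimate P(target word | pseudoterm) from sentence-level co-occurrence.

    pairs: iterable of (pseudoterms, target_words), where pseudoterms are
    hypothetical UTD cluster IDs found in one utterance and target_words
    are the words of its text translation.
    """
    cooc = defaultdict(Counter)   # pseudoterm -> counts of co-occurring words
    term_count = Counter()        # pseudoterm -> number of utterances containing it
    for terms, words in pairs:
        for t in set(terms):
            term_count[t] += 1
            for w in set(words):
                cooc[t][w] += 1
    return {
        t: {w: c / term_count[t] for w, c in ws.items()}
        for t, ws in cooc.items()
    }


def translate(model, terms, k=1):
    """Return the k highest-scoring target words as an unordered bag."""
    scores = Counter()
    for t in set(terms):
        for w, p in model.get(t, {}).items():
            scores[w] += p
    return [w for w, _ in scores.most_common(k)]


# Toy parallel data: pseudotext (cluster IDs) paired with English translations.
pairs = [
    (["c12", "c7"], ["good", "morning"]),
    (["c12"], ["good", "night"]),
    (["c7", "c3"], ["morning", "coffee"]),
]
model = train_bow(pairs)
```

With this toy data, `translate(model, ["c12"])` yields `["good"]`, while an utterance in which UTD found no known cluster yields an empty list. The second case mirrors the low-recall problem the abstract reports: when no pattern is discovered in an utterance, nothing can be translated.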

Similar resources

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

A New Approach to Speech-Input Statistical Translation

Statistical pattern recognition is a promising framework for text-to-text translation. However, a natural extension to speech-input translation is not straightforward. In this paper, we present a method to deal with the speech-input statistical translation problem that could be considered a step towards a fully integrated recognition-translation procedure. In this version a word graph wa...

Towards real-time multilingual multimodal speech-to-speech translation

Speech-to-speech translation technology enables natural oral communication between people who speak different languages. Many research projects have addressed speech-to-speech translation (S2ST) technology, such as ATR [1], VERBMOBIL [2], C-STAR [3], NESPOLE! [4], BABYLON [5], GALE [6], and EU-bridge [7]. The speech-to-speech translation system is normally composed of automatic speech recognition ...

The Effect of Private Speech and Self-Regulation on Translation Quality among Iranian Translation Students: A Mixed-Methods Study

The current study presents findings from a mixed-methods investigation of the self-regulatory role of private speech (self-talk) in students’ translation quality. The aim of the study was to validate the adapted version of a self-verbalization questionnaire. The construct validity and reliability of the scale were supported by the CFA which revealed that all items reached the acceptable f...

Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation

Current speech translation systems integrate (loosely or closely) two main modules: source language speech recognition (ASR) and source-to-target text translation (MT). In these approaches, source language text transcript (as a sequence or as a graph) appears as mandatory to produce a text hypothesis in the target language. In the meantime, deep neural networks have yielded breakthroughs in dif...

Publication date: 2017